Generalized Discriminative Training for Speech Recognition

Authors

  • Roger Hsiao
  • Alan Black
  • Florian Metze
Abstract

In speech recognition, discriminative training has proved to be an effective method to improve recognition accuracy. It has successfully improved systems of different scales and different languages. While discriminative training has been developing for over 20 years, it continues to draw attention from researchers and remains one of the most important topics in speech recognition to date. Discriminative training aims to directly minimize the errors made by the generative models. It is often formulated as an optimization problem that involves the reference and the competing hypotheses. The goal of the optimization is to find the model parameters that minimize the error on the training set, where the error is often represented by a smoothed error function.
While effective, discriminative training comes with a few drawbacks. First, the optimization problem is difficult to solve because of the complex objective functions. This leads to the need for heuristics and smoothing techniques in the optimization algorithms. Second, discriminative training is much more time-consuming than the conventional maximum likelihood (ML) approach, since in addition to the reference, it also considers the competitors. The prolonged training time becomes an even bigger issue when one incorporates both model space and feature space discriminative training.
The goal of this thesis is to propose a family of optimization algorithms that are simple and efficient. When tuning and heuristics become necessary, the theory behind the algorithms should explain the meaning of the parameters and give users some basic guidance on tuning, instead of relying on a purely empirical approach. Therefore, we reformulate the optimization problem for discriminative training and propose new optimization algorithms based on Lagrange relaxation, in which we relax the difficult optimization problems into simpler convex problems.
We propose the generalized Baum-Welch (GBW) algorithm for model space discriminative training and the generalized discriminative feature transformation (GDFT) for feature space discriminative training. The GBW algorithm generalizes the Baum-Welch (BW) and the extended Baum-Welch (EBW) algo-


Similar resources

Improvements to generalized discriminative feature transformation for speech recognition

Generalized Discriminative Feature Transformation (GDFT) is a feature space discriminative training algorithm for automatic speech recognition (ASR). GDFT uses Lagrange relaxation to transform the constrained maximum likelihood linear regression (CMLLR) algorithm for feature space discriminative training. This paper presents recent improvements on GDFT, which are achieved by regularization to t...


A comparative study on maximum entropy and discriminative training for acoustic modeling in automatic speech recognition

While Maximum Entropy (ME) based learning procedures have been successfully applied to text-based natural language processing, there has been little investigation of ME for acoustic modeling in automatic speech recognition. In this paper we show that the well-known Generalized Iterative Scaling (GIS) algorithm can be used as an alternative method to discriminatively train the parameters ...


A log-linear discriminative modeling framework for speech recognition

Conventional speech recognition systems are based on Gaussian hidden Markov models (HMMs). Discriminative techniques such as log-linear modeling have been investigated in speech recognition only recently. This thesis establishes a log-linear modeling framework in the context of discriminative training criteria, with examples from continuous speech recognition, part-of-speech tagging, and handwr...


Discriminative training of stochastic Markov graphs for speech recognition

This paper proposes the application of discriminative training techniques based on the Generalized Probabilistic Descent (GPD) approach to Stochastic Markov Graphs (SMGs), a generalization of mixture-state Hidden Markov Models (HMMs), describing the constraints in the acoustic structure of speech as a graph consisting of nodes, each containing a base function, and a transition network between t...


Discriminative training for continuous speech recognition

Discriminative training techniques for Hidden Markov Models were recently proposed and successfully applied for automatic speech recognition. In this paper a discussion of the Minimum Classification Error and the Maximum Mutual Information objective is presented. An extended reestimation formula is used for the HMM parameter update for both objective functions. The discriminative training method...




Publication date: 2010